Temporal difference learning in complex domains
Author
Abstract
Similar resources
Some Explorations in Reinforcement Learning Techniques Applied to the Problem of Learning to Play Pinball
Historically, the accepted approach to control problems in physically complicated domains has been machine learning, because knowledge engineering in these domains can be extremely difficult. When the already physically complicated domain is also continuous and dynamical (possibly with composite and/or sequential goals), the learning task becomes even more difficult due t...
Control of Multivariable Systems Based on Emotional Temporal Difference Learning Controller
One of the most important issues we face in controlling delayed systems and non-minimum-phase systems is fulfilling objective orientations simultaneously and in the best possible way. In this paper, proposing a new method, we present an objective orientation for controlling multi-objective systems. The principles of this method are based on emotional temporal difference learning, and it has a...
TDγ: Re-evaluating Complex Backups in Temporal Difference Learning
We show that the λ-return target used in the TD(λ) family of algorithms is the maximum-likelihood estimator for a specific model of how the variance of an n-step return estimate increases with n. We introduce the γ-return estimator, an alternative target based on a more accurate model of variance, which defines the TDγ family of complex-backup temporal difference learning algorithms. We derive T...
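To make the λ-return target that this abstract discusses concrete, here is a minimal sketch of the standard definitions: the n-step return bootstraps from a value estimate after n steps, and the λ-return mixes all n-step returns with geometric weights (1−λ)λ^(n−1). The reward list, value table, and episode layout here are hypothetical illustrations, not the paper's code.

```python
def n_step_return(rewards, values, t, n, gamma):
    """n-step return G_t^(n): n discounted rewards from step t,
    bootstrapped with the value estimate at state t+n (if the
    episode has not ended by then)."""
    T = len(rewards)
    end = min(t + n, T)
    g = sum(gamma ** (k - t) * rewards[k] for k in range(t, end))
    if t + n < T:
        g += gamma ** n * values[t + n]
    return g

def lambda_return(rewards, values, t, gamma, lam):
    """lambda-return: geometric (1-lam)*lam^(n-1) mixture of n-step
    returns; the full-episode return absorbs the residual weight."""
    T = len(rewards)
    g = 0.0
    for n in range(1, T - t):
        g += (1 - lam) * lam ** (n - 1) * n_step_return(rewards, values, t, n, gamma)
    # remaining weight lam^(T-t-1) goes to the full (Monte Carlo) return
    g += lam ** (T - t - 1) * n_step_return(rewards, values, t, T - t, gamma)
    return g
```

With λ = 0 this reduces to the one-step TD target, and with λ = 1 to the Monte Carlo return, which is the spectrum of complex backups the TDγ work re-evaluates.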
Speeding up Tabular Reinforcement Learning Using State-Action Similarities
One of the most prominent approaches for speeding up reinforcement learning is injecting human prior knowledge into the learning agent. This paper proposes a novel method to speed up temporal difference learning by using state-action similarities. These hand-coded similarities are tested in three well-studied domains of varying complexity, demonstrating our approach's benefits.
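One way the idea above can be sketched: after computing an ordinary TD error for the visited state-action pair, share a similarity-weighted fraction of that update with related pairs. This is an illustrative sketch of the general technique, with a hypothetical hand-coded `similarity` function; the paper's actual method may differ in detail.

```python
def td_update(Q, s, a, r, s_next, actions, alpha, gamma, similarity):
    """One Q-learning-style tabular TD update, then propagate the
    same TD error (scaled by a hand-coded similarity weight w in
    [0, 1]) to related state-action pairs, speeding up learning."""
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    delta = r + gamma * best_next - Q.get((s, a), 0.0)
    # similarity(s, a) yields ((state, action), weight) pairs,
    # including (s, a) itself with weight 1.0
    for (s2, a2), w in similarity(s, a):
        Q[(s2, a2)] = Q.get((s2, a2), 0.0) + alpha * w * delta
    return Q
```

A similarity weight of zero everywhere except the visited pair recovers plain tabular TD learning, so the ordinary algorithm is a special case of this sketch.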
Transfer of Knowledge Structures with Relational Temporal Difference Learning
The ability to transfer knowledge from one domain to another is an important aspect of learning. Knowledge transfer increases learning efficiency by freeing the learner from duplicating past efforts. In this paper, we demonstrate how reinforcement learning agents can use relational representations to transfer knowledge across related domains.
Journal:
Volume, Issue
Pages -
Published 1999